Abstract

Climate data and distribution data for the Canadian waterweed Elodea canadensis Michx. from North
America, the whole of Europe and Finland were used to investigate the ability of bioclimatic envelope models
to predict the distribution range and recent northward range shift of the species in Europe. Four
main types of models were developed using the North American data, including either three ‘baseline’
climate variables (growing degree days, temperature of the coldest month, water balance) or an extended
set of seven climate variables, both averaged either over a 30-year time slice or a longer 90-year time slice.
Ten different random selections of pseudo-absences were generated from the North American data, on the
basis of which ten separate generalized additive models (GAMs) were developed for each main model type.
All 40 GAMs were applied first to North America and then transferred to the whole of Europe and
Finland. All the models showed statistically highly significant accuracy in the three study areas. Although
the differences among the four main model types were only minor, the two extended model types showed
on average statistically better performance than the two baseline model types based on Bayesian information
criterion (BIC) values, the amount of deviance explained by the models, resubstitution validation and
four-fold cross-validation in North America. They also provided slightly more accurate predictions of
the climatically suitable area for Elodea canadensis in Finland both in 1961-1984 and 1985-2006. However,
the projections from the individual extended models were more variable than projections from the baseline
models. Thus model predictions based on a variety of predictor variables but only one selection of
pseudo-absences may be subject to biases, and outputs from multiple models should be investigated to better
account for uncertainties in modelling. Overall, our results suggest that more attention should be paid to
the careful selection of predictor variables and the use of multiple pseudo-absence sets in ecological
niche modelling in order to increase the reliability of projections of the range shifts of invasive species.

Introduction


Invasive species are recognized as a major environmental problem: they can have
manifold ecological impacts (Mack et al. 2000, Weidema 2000, Peterson 2003a), cause
high economic costs (Forman 2003, Pimentel 2005), change hydrological cycles, fire
regimes and nutrient cycling, and cause significant environmental damage (Forman
2003). When successfully established in a new area, invasive species can displace
populations of native species, threaten rare species and ultimately cause local
extinctions attributable to predation, grazing and habitat alteration (Forman 2003, Rahel
and Olden 2008), and more rarely competition (Davis 2003, Sax and Gaines 2008).

The spread of invasive species will probably be accelerated by the on-going and 
projected climate change (Dukes and Mooney 1999, Weber 2001, Hellmann et al. 
2008). The magnitude of the projected global warming is particularly high in northern 
latitudes (ACIA 2005), including northern Europe, and thus the likelihood of climate 
change-induced range shifts of invasive species is pronounced in such areas (Rahel and
Olden 2008). In southern and central Europe, warming climate has already boosted 
the spread of many invasive species, e.g. palms (Walther et al. 2007) and other exotic 
evergreen broad-leaved plant species (Walther et al. 2001), and thermophilic tropical 
and Capensis ornamental plants (Vesperinas et al. 2001). Similar evidence is accu- 
mulating from northern Europe (Weidema 2000, ACIA 2005), but more systematic 
analyses of the observed range shifts of invasive species in relation to recent climatic 
changes are largely lacking. 

Identification of areas most at risk of becoming invaded by a given alien spe- 
cies and projections of the further spread of already naturalised species can provide 
valuable information for management planning (Weber 2001, Mau-Crimmins et al. 
2006) and targeting control measures (Kriticos et al. 2003, Richardson and Thuiller 
2007). One proactive approach to identify areas at risk is provided by ecological niche 
modelling (Weber 2001, Roura-Pascual et al. 2004, Ficetola et al. 2007). The main 
steps in ecological niche modelling include: (i) relating the known occurrences of the 
target species to the ecological characteristics of the study landscape, (ii) producing 
a model that defines the ecological dimensions of the species niche, and (iii) project- 
ing the derived ecological niche model back onto the geographical space to identify 
regions with environmental conditions inside or outside the species’ niche (Peterson
and Vieglais 2001).

Ecological niche models can utilize many different environmental predictors such
as climate, topography, soil classes and land cover (Peterson et al. 2003, Iguchi et al. 
2004, Mau-Crimmins et al. 2006). However, at broad macroecological scales climate 
variables are often the only predictors available over large areas, and climate also largely 
determines species distributions (Thuiller et al. 2004, Luoto et al. 2007). Under such 
circumstances ecological niche modelling becomes materially the same as bioclimatic 
envelope modelling (Pearson and Dawson 2003, Guisan and Thuiller 2005). Indeed, 
increasing numbers of broad-scale applications of ecological niche models developed 
for invasive species have been based on climate variables (e.g. Beerling et al. 1995, 
Baker et al. 2000, Broennimann et al. 2007). This study also focuses on broad-scale 
species-climate models and the ‘first-filter’ identification of the areas potentially at
risk of being invaded (Weber 2001, Welk 2004). 

However, certain factors may decrease the usefulness of bioclimatic envelope models 
in modelling invasive species (Pearson and Dawson 2003, Thuiller 2004, Luoto et al. 
2005, 2007, Heikkinen et al. 2006a, 2007). In this study we address three potential 
limitations. First, the selection of climate parameters may significantly affect the
performance of the species-climate models (Beaumont et al. 2005, 2007, Heikkinen
et al. 2006b, Loiselle et al. 2008, Peterson and Nakazawa 2008). Increasing attention 
should thus be paid to careful selection of climatic variables in order to model and assess 
potential future species distributions as accurately as possible (Heikkinen et al. 2006b, 
Beaumont et al. 2007, Peterson and Nakazawa 2008). Second, bioclimatic modelling 
studies often show a mismatch between the time slice over which the climate data is av- 
eraged and the time slice when species records have been collected, but it is insufficiently 
understood whether this affects model performance. Such mismatches are common in 
studies employing plant atlas data bases (such as Atlas Florae Europaeae; Jalas and Suom- 
inen 1988; http://www.fmnh.helsinki.fi/english/botany/afe/), which often include ag- 
glomerative records from several decades or even centuries (e.g. Beerling et al. 1995, 
Huntley et al. 1995, Sykes et al. 1996). Third, certain modelling methods require both 
presence and absence data. Many recent studies have adopted a strategy of selecting a set 
of pseudo-absences from the overall set of assumed absence data points to be used in the 
model calibration (e.g. McPherson et al. 2004, Guisan et al. 2007). The pseudo-absence 
approach may be a particularly attractive option when the modelling is based on atlases, 
museum data and databases. Such data sources often do not provide detailed enough
information about the recording effort in the sites where the species has not been detected,
and consequently false absences can be included in the models, which decreases the
reliability of their predictions (Chefaoui and Lobo 2008). However, models based on only
one set of pseudo-absences may be vulnerable to sporadic biases in the selection process 
(Engler et al. 2004). Developing multiple models based on different sets of pseudo- 
absences is thus preferable (Thomaes et al. 2008). However, it is poorly known whether 
increasing the number of predictor variables used in the modelling increases the vari- 
ability among projections from the models based on different pseudo-absence data sets. 

Modelling studies with freshwater invasive species, especially invasive aquatic plant
species, are sparser than studies using terrestrial species (Dominguez-Dominguez
et al. 2006; but see Peterson et al. 2003), although invaders can have dramatic effects
on freshwater communities (Kozhova and Izhboldina 1993, Simon and Townsend
2003). In this study we investigate the potential of bioclimatic envelope models to
provide useful predictions for an invasive aquatic plant species, the Canadian
waterweed Elodea canadensis Michx., in Europe, and to predict recent changes in its
distribution range in northern Europe, in Finland, with respect to the climate. We specifically
investigate the importance of the selection of climatic predictors and the delimitation
of the time slice over which climate data is averaged for the model performance.
Modelling of terrestrial plant species has often focused on similar types of key variables, such
as the mean temperature of the coldest month, growing degree day sum above a 5°C
threshold, and the ratio of actual to potential evaporation (Huntley et al. 1995, Sykes
et al. 1996). We study here how useful these three ‘baseline’ climate variables are in
modelling the distribution of Elodea canadensis in comparison to an ‘extended’ set of
climate variables including four other climate parameters potentially better reflecting
some critical aspects of the biology of Elodea. The main questions of this study are: (1)
how successful are bioclimatic envelope models in predicting the distribution area and
recent northward spread of Elodea canadensis in Europe; (2) are there differences in the
performance of models based on medium-term vs. long-term climate data, and models
including three baseline climate variables vs. an extended set of climate variables; (3)
which climate variables are the best predictors of the distribution of Elodea canadensis;
and (4) does the model performance vary between the models based on different
sets of pseudo-absences, and is this variation more notable between models with an
extended set of climate variables than between models with the three baseline climate variables?


Methods 


The study species 
The study species, the Canadian waterweed Elodea canadensis Michaux, is a member of
the family Hydrocharitaceae (Simpson 1984, Cook and Urmi-König 1985). Elodea
canadensis is a submerged aquatic plant which is native only to the New World. The
species occurs in inland lakes, ponds and slowly moving waters in rivers, streams and canals
(Cook and Urmi-König 1985). It prefers cool water temperatures (tolerance ranging
between 10 and 25°C), and calcium-rich eutrophic water (pH 6.5-10). In northern
Europe it grows mainly in relatively firm, nutrient-rich sediments with a high mineral
content (Weidema 2000). Elodea canadensis is able to form dense single-species stands
and become a dominant species in water 0.1-1.5 m deep (Cook and Urmi-König 1985,
Kozhova and Izhboldina 1993). It tolerates relatively high levels of light, but not frost.
The species is able to recommence growth as soon as the temperature rises in spring. It
fragments easily and disperses effectively by vegetative means, as the fragments have a
high survival rate (Cook and Urmi-König 1985, Barrat-Segretain et al. 2002).

In optimal growing conditions Elodea canadensis can be a troublesome species. 
Dense stands of Elodea reduce temperature and oxygen concentrations of water, and
decomposing stands cause internal nutrient loading (Cook and Urmi-König 1985,
Weidema 2000). In northern Europe, mass occurrences of the species may alter the
whole lake ecosystem and turn the water hyper-eutrophic and muddy. In Norway, such
mass occurrences have very probably caused disappearances of populations of red-listed
plant species that inhabited certain lakes and ponds before the invasion of Elodea
canadensis (Weidema 2000, pp. 98-99). Interestingly, after the establishment of the
species in a given waterbody a cyclical trend has often occurred. Within the first 3-4 
years, the species attains a pest position during which it can effectively exclude other 
macrophytes. However, after the next 3 to 10 years the populations often decline stead- 
ily, and thereafter the species remains as a small relict population, or may disappear for 
some time (Simpson 1984). 

Several different dispersal mechanisms have been suggested for Elodea canadensis, 
including deliberate translocations in (botanical) gardens, aquarium trade, and
fragments carried passively with timber material or small recreational boats
(Simpson 1984, Cook and Urmi-König 1985, Weidema 2000, Pienimäki and Leppäkoski
2004). In addition, the long-lasting fragments of the species disperse effectively via
watercourses and may also be transported by waterfowl from one lake to another. 
Remarkably, only female plants of Elodea canadensis occur in northern Europe, mean- 
ing that there it is dispersed only by vegetative means, i.e. mainly via fragments that 
become rooted (Weidema 2000, p. 98). 


Distribution data and range shifts 

Elodea canadensis is (for the most part) native and widespread in temperate North
America, the core distribution area extending in the north to ca. 55°N in Canada and
southwards to about 35°N in Alabama, USA. The main occurrences of the species are
concentrated around the Great Lakes and the St. Lawrence Valley (Cook and Urmi-König
1985). The distribution of Elodea canadensis in North America was extracted from
three sources: (i) the map published by Cook and Urmi-König (1985), (ii) Flora of
North America, Vol. 22, Hydrocharitaceae (Committee 1993+; accessed via
http://www.fna.org/FNA/), and (iii) the species distribution database maintained by the USDA
(United States Department of Agriculture; http://plants.usda.gov/). The presence
records from these three sources were pooled and re-sampled into a lattice
system using grid cells 0.5° x 0.5° in size, and a geographical window ranging from 20°N,
140°W to 70°N, 52°W. However, as the species occurs predominantly in inland water
bodies, only mainland areas in North America and parts of Mexico from this window
were included in the actual modelling (Fig. 1a), resulting in a set of 9701 grid cells, of
which 2015 cells had the species.

In Europe, Elodea canadensis was introduced first to Northern Ireland in 1836,
then in the 1840s to Scotland and England, and from 1850 onwards it spread rapidly
over the British Isles (Simpson 1984, Cook and Urmi-König 1985). In 1850-1860,
the species was introduced to Belgium, Germany and the Netherlands, from where it
spread rapidly to several other central European countries. In the Nordic countries,
except Norway, the species was first recorded in the late 19th century. The distribution
data for Elodea canadensis in Europe were taken from Hultén and Fries (1986), and
digitized using a lattice system with cells 0.5° x 0.5° in size. The European window
used in this study ranged from 34.5°N, 10.5°W to 71.5°N, 45.0°E. Grid cells falling
in sea areas were excluded from the data set, and 5083 grid cells were retained
in the final European data set, including 1881 cells with known occurrences (Fig. 1b).

In Finland, Elodea canadensis was first planted in the Botanical Garden in Helsinki
and other corresponding places, but its spread in Finland became rapid and aggressive
only in the 20th century (Weidema 2000). By 1920, the species was recorded from several
locations in southern Finland (Hintikka 1917). Since then, it has continued to expand its
distribution range and has recently been recorded from relatively northern water bodies
(Fig. 1c). The distribution data of Elodea canadensis in Finland were derived from the
national atlas database ‘Kastikka’ for vascular plants (Lampinen and Lahti 2007;
http://www.luomus.fi/kasviatlas). The known presence points of the species were recorded using a
uniform grid system with grid cells of 10 x 10 km in size (n = 3544) (Fig. 1c). These records
were assigned into two temporally delimited data sets: records made in 1960-1984 and in
1985-2006. Based on the floristic resurveys of known occurrences, we assumed
that all the 10-km grid cells that had occurrences of the species in the earlier surveys
still had the species in the later time periods. Thus all the records made in 1960-1984 were
also considered as valid positive records in 1985-2006, and records made before 1960
were included both in the 1960-1984 and the 1985-2006 species data. We acknowledge
here that this assumption might be unrealistic for many short-lived species. However,
for the occurrences of Elodea canadensis it is reasonable, because the species is able
to develop long-lasting populations, regenerate vegetatively and spread effectively within a
particular waterbody (Cook and Urmi-König 1985, Kozhova and Izhboldina 1993). The
data from the national atlas database ‘Kastikka’ for vascular plants, as well as the empirical
lake monitoring data collected by one of the authors (HT), suggest that the species is able
to persist in the same regions or even the same lakes in Finland for at least 50-60 years.

Figure 1. Distribution of Elodea canadensis in the three study areas. (a) Grid cells in North America
with known occurrences (n = 2015, in dark grey) and cells in which the species has not been recorded (n =
7686, in light grey), (b) grid cells in Europe with known occurrences (n = 1881) and cells with no known
occurrences (n = 3202), and (c) grid cells in Finland in 1961-1984 (in dark grey) and 1985-2006 (in
black) with known occurrences (n = 276 and 375) and cells with no known occurrences (in light grey).
For all the 10 data sets used in calibrating the models in North America, 2015 random pseudo-absence
points were selected from the 7686 (presumed) absence grid cells.


Climate data 

Mean monthly precipitation and temperature values on a grid with 0.5° x 0.5° spatial
resolution for North America and Europe matching the species data were extracted
from the Climatic Research Unit (CRU) TS 2.0 dataset (New et al. 2002, Mitchell et
al. 2003), and averaged over two time periods. The two time slices in North America
were 1901-1990 (‘long-term’ climate data) and 1961-1990 (‘medium-term’ climate
data), and in Europe 1901-1980 and 1951-1980, delimited on both continents
to match the probable latest recording years in the species data. Climate data for the
years 1961-2006 on a 10 x 10 km grid covering the whole of Finland were provided by
the Finnish Meteorological Institute (Venäläinen et al. 2005) and averaged over two
time slices corresponding to those of the species data, 1961-1984 and
1985-2006 (interpolated values for Finland were not available before 1961).

For each of the three geographical areas and all the different time slices, we
calculated two ‘competing’ sets of climate predictor variables. The first data set consisted of
three ‘baseline’ climate variables that are considered to be among the most important
broad-scale determinants of the ranges of terrestrial plants: (i) mean temperature of the
coldest month (MTCO), (ii) growing degree days above 5°C (GDD5) and (iii) water balance
(WB) (see Beerling et al. 1995, Huntley et al. 1995, Sykes et al. 1996). GDD5 was
derived by estimating daily values from monthly mean temperatures using a sine curve
interpolation (Brooks 1943). The water balance was calculated as the annual sum of
the monthly differences between precipitation and potential evapotranspiration
following Skov and Svenning (2004). The formulas applied were the following:


$$WB = \sum_{i=1}^{12} (P_i - PET_i) \qquad (1)$$

where $P_i$ = mean precipitation in month $i$,

$PET_i$ = mean potential evapotranspiration in month $i$ = $(58.93 \times T_i)/12$ [if $T_i > 0$ °C,
else $PET_i = 0$],

and $T_i$ = mean temperature in month $i$.
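
As a worked illustration of Equation 1, the following minimal Python sketch computes the annual water balance for a single grid cell from twelve monthly means. The function names `monthly_pet` and `water_balance` and the example values are hypothetical; they are not taken from the original analysis, which was carried out in S-Plus.

```python
def monthly_pet(temp_c):
    """Potential evapotranspiration for one month (Equation 1 definition).

    PET = 58.93 * T / 12 when the mean monthly temperature T is above 0 deg C,
    otherwise 0.
    """
    return 58.93 * temp_c / 12.0 if temp_c > 0 else 0.0


def water_balance(precip, temp_c):
    """Annual water balance: sum over 12 months of (P_i - PET_i)."""
    if len(precip) != 12 or len(temp_c) != 12:
        raise ValueError("expected 12 monthly values for precipitation and temperature")
    return sum(p - monthly_pet(t) for p, t in zip(precip, temp_c))


# Hypothetical grid cell: 50 units of precipitation and 12 deg C in every month.
example_wb = water_balance([50.0] * 12, [12.0] * 12)
```

With these illustrative inputs every month has PET = 58.93, so the annual balance is negative, which would indicate a cell where potential evapotranspiration exceeds precipitation.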

In the second case, we complemented the three baseline variables with four
additional variables in order to investigate whether the extended set of climate variables
would provide more accurate projections of the climatically suitable areas for the
species. The four additional variables aimed to reflect some critical stage in the life cycle
of Elodea canadensis, including: (i) mean temperature in spring (Temp_spring; March,
April, May), (ii) mean July temperature (Temp_July), (iii) water deficit in (late) summer
(‘Defi_sum’; July, August, September), and (iv) annual water deficit (‘Defi_ann’). Monthly
water deficit values were calculated following Ohlemüller et al. (2006). The formula
for measuring water deficit was almost the same as that used to calculate the water
balance (Equation 1), the difference being that only those months are taken
into account where PET_i exceeds the precipitation (P_i); otherwise P_i - PET_i is set
to 0. These monthly values were summed for July-September and for the whole year.
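
The water deficit calculation described above can be sketched as follows. This is an illustration only: the function names are hypothetical, months are assumed to be ordered January to December, and the sign convention (deficits reported as negative values of P - PET) is inferred from the description rather than confirmed by the source.

```python
def monthly_pet(temp_c):
    """PET as in Equation 1: 58.93 * T / 12 if T > 0 deg C, else 0."""
    return 58.93 * temp_c / 12.0 if temp_c > 0 else 0.0


def monthly_deficit(precip, temp_c):
    """P - PET when PET exceeds precipitation, otherwise 0 (assumed sign convention)."""
    diff = precip - monthly_pet(temp_c)
    return diff if diff < 0 else 0.0


def water_deficits(precip, temp_c):
    """Return (summer deficit for July-September, annual deficit)."""
    if len(precip) != 12 or len(temp_c) != 12:
        raise ValueError("expected 12 monthly values, January to December")
    monthly = [monthly_deficit(p, t) for p, t in zip(precip, temp_c)]
    summer = sum(monthly[6:9])  # indices 6, 7, 8 = July, August, September
    return summer, sum(monthly)


# Hypothetical warm, fairly dry cell: 100 units of precipitation, 24 deg C all year.
defi_sum, defi_ann = water_deficits([100.0] * 12, [24.0] * 12)
```

In a month where precipitation exceeds PET the contribution is zero, so a wet cell returns (0.0, 0.0) regardless of temperature.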

The first reasoning for including these variables was that mean temperatures in 
spring can be an important determinant for the distribution of Elodea canadensis. 
This is because the species is able to regenerate actively soon after the temperatures 
increase (Barrat-Segretain et al. 2002). Second, one factor potentially governing the 
southern range margin of the species is the water temperature during the warmest 
part of the growing season. In this study we used the July temperature and the water 
deficit in late summer as surrogates for direct measurements of the temperature in 
inland water bodies. Earlier studies have found especially the July air temperature to 
be a good predictor of maximum surface-water temperatures (Mohseni et al. 2003, 
Sharma et al. 2007). Water deficit in late summer summarises the interactions
between temperature and precipitation and thus also has the potential to indicate
landscapes where the water levels in many lakes can become low, and consequently
growing conditions can more readily become too warm for Elodea canadensis.
Annual water deficit provides an additional indication of the areas which may
face a cumulative water deficit and heating effect and thus show high maximum
surface-water temperatures.


Statistical analysis 

We used generalized additive models (GAMs) in the bioclimatic envelope modelling of 
Elodea canadensis. Generalized additive models are flexible data-driven non-parametric 
extensions of generalized linear models (Hastie and Tibshirani 1990) that allow both 
linear and complex additive response curves to be fitted (Wood and Augustin 2002). 
All the GAMs were developed using the GRASP (Lehmann et al. 2003) user
interface in S-Plus (Version 6.1 for Windows, Insightful Corp.). 

The modelling process included several separate steps. In the first step,
following Beaumont et al. (2009), ten random selections of 2015 pseudo-absence grid cells
were taken in North America from the 7686 grid cells with no records of Elodea. The
2015 grid cells with known presences were added to each of the ten random draws,
and thus all the 10 combined random sets had the recommended prevalence of 50%
(McPherson et al. 2004, Meynard and Quinn 2007). Next, models were calibrated
four times for each random set, i.e. using the species data and each of the four different climate data
sets. In total, 40 different GAMs were developed by applying the 10 random data sets
separately to four main types of GAM: (1) medium-term (1961-1990) climate data
including 3 baseline variables, (2) long-term (1901-1990) climate data including 3
baseline variables, (3) medium-term climate data including an extended set of 7
variables, and (4) long-term climate data including an extended set of 7 variables.
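
The construction of the ten calibration sets can be sketched in Python as below. The grid-cell identifiers, the function name `draw_calibration_sets` and the fixed seed are assumptions made for illustration; the original random draws were made in the authors' own workflow and cannot be reproduced from the text.

```python
import random


def draw_calibration_sets(presence_ids, absence_ids, n_sets=10, seed=0):
    """Pair all presences with an equal-sized random draw of pseudo-absences.

    Each returned set therefore has a prevalence of 50%.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    sets = []
    for _ in range(n_sets):
        # Sampling without replacement within a set; sets are drawn independently.
        pseudo_absences = rng.sample(absence_ids, len(presence_ids))
        sets.append({"presence": list(presence_ids), "pseudo_absence": pseudo_absences})
    return sets


# Hypothetical cell identifiers matching the counts given in the text.
presence_cells = list(range(2015))        # 2015 cells with known occurrences
absence_cells = list(range(2015, 9701))   # 7686 cells without records
calibration_sets = draw_calibration_sets(presence_cells, absence_cells)
```

Each of the ten sets would then be combined with each of the four climate data sets, giving the 40 models described above.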

All the GAMs were built using a stepwise procedure to select relevant explanatory
variables and the level of complexity of the response shapes. A starting model
including all continuous predictors smoothed with 3 degrees of freedom was fitted
first. Following Bio et al. (2002), dropping each variable or converting it to linear form
was tested using the Bayesian information criterion (BIC) (Johnson and Omland 2004),
which is more selective than the widely used Akaike’s information criterion (AIC). All
the predictor variables that were selected in the final model were required to make a
model contribution (i.e. contribution of a given predictor within the selected models,
as measured by GRASP; see Lehmann et al. 2003) of 5% or more. Moreover, from
each pair of highly (>0.90) correlated variables the one with the lower model contribution
was excluded from the model. Because the response variable represented binary data
(presence or absence of the species), a binomial error distribution with a logistic link
function was applied (Lehmann et al. 2003).
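
The statement that BIC is more selective than AIC can be made concrete with the general textbook definitions of the two criteria. These definitions are added here for illustration; the exact form implemented in GRASP is not stated in the text.

```latex
\mathrm{AIC} = -2\ln\hat{L} + 2k, \qquad
\mathrm{BIC} = -2\ln\hat{L} + k\,\ln n
```

Here $\hat{L}$ is the maximised likelihood, $k$ the number of (effective) model parameters and $n$ the number of observations. With the $n = 2015 + 2015 = 4030$ grid cells in each calibration set, $\ln n \approx 8.30$, so under these definitions each additional degree of freedom is penalised roughly four times as heavily by BIC as by AIC, which favours simpler models.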

The performance of the 40 GAMs in predicting the distribution of Elodea
canadensis in North America was evaluated by four measures: (i) BIC (smaller values are
indicative of a better fit to the data) (Venables and Ripley 2002), (ii) the amount of deviance
explained (D²) (i.e. the ratio of the explained deviance to the total deviance), (iii) the
resubstitution method (or ‘simple validation’, see Lehmann et al. 2003) based on a 
plot of observed response values against the values predicted by the model, and the 
subsequent area under the curve (AUC) of a receiver operating characteristic (ROC) 
plot (Fielding and Bell 1997), and (iv) four-fold cross-validation, carried out with four 
random subsets of the entire dataset. In the four-fold cross-validation, each randomly 
selected subset was dropped from the model, the model was recalculated and predic- 
tions were made for the omitted data points. Combination of the predictions from the 
different subsets was then plotted against the observed data (Lehmann et al. 2003), 
and model performance was measured using the AUC of the ROC plot. The following 
interpretation of AUC-values was used (Swets 1988): AUC>0.9: excellent agreement 
between observed and predicted distribution; 0.8<AUC<0.9: good model accuracy; 
0.7<AUC<0.8: fair; 0.6<AUC<0.7: poor; AUC<0.6: fail. Differences between the four 
main types of GAMs with respect to the four model performance measures were ana- 
lysed using a paired t-test (Quinn and Keough 2002). 
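
For readers unfamiliar with the AUC, the sketch below computes it directly as the probability that a randomly chosen presence cell receives a higher predicted value than a randomly chosen absence cell, and then maps the value onto the interpretation classes listed above. The function names are hypothetical, and assigning values that fall exactly on a class boundary to the lower class is an assumption, since the text leaves the boundaries open.

```python
def auc(presence_scores, absence_scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    if not presence_scores or not absence_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def swets_class(auc_value):
    """Interpretation classes after Swets (1988) as quoted in the text.

    Assumption: a value exactly on a boundary is placed in the lower class.
    """
    if auc_value > 0.9:
        return "excellent"
    if auc_value > 0.8:
        return "good"
    if auc_value > 0.7:
        return "fair"
    if auc_value > 0.6:
        return "poor"
    return "fail"


# Hypothetical predicted probabilities for three presence and three absence cells.
example_auc = auc([0.9, 0.8, 0.4], [0.1, 0.5, 0.3])
example_class = swets_class(example_auc)
```

In a four-fold cross-validation the same calculation would be applied to the pooled predictions made for the omitted subsets rather than to the fitted values.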

In the final part of analysis of the North American data, probability values were 
generated for the occurrence of Elodea canadensis in all the 9701 grid cells by fitting the 
developed 40 models to this full data set. The geographical patterns and variability of 
the probability values between the four main types of GAMs were visually investigated, 
both by comparing the mean probability of occurrence values averaged for each grid 
cell over the 10 random GAMs and their standard deviation, and by investigating the 
probabilities from the individual random GAMs. 
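
The per-grid-cell summary across the ten random GAMs can be sketched as follows. The data layout (one list of probabilities per model, in the same cell order) and the function name are assumptions, and the sample standard deviation is used here because the text does not say which form was applied.

```python
from statistics import mean, stdev


def summarise_cells(prob_by_model):
    """Return per-cell means and standard deviations across models.

    prob_by_model: list of lists, one inner list of probabilities per model,
    all in the same grid-cell order. At least two models are required.
    """
    n_cells = len(prob_by_model[0])
    if any(len(model) != n_cells for model in prob_by_model):
        raise ValueError("all models must cover the same grid cells")
    per_cell = list(zip(*prob_by_model))  # regroup values cell by cell
    means = [mean(values) for values in per_cell]
    sds = [stdev(values) for values in per_cell]  # sample SD (assumption)
    return means, sds


# Hypothetical probabilities from three models for two grid cells.
cell_means, cell_sds = summarise_cells([[0.2, 0.9], [0.4, 0.9], [0.6, 0.9]])
```

A cell on which all models agree has a standard deviation of zero, whereas a large value flags cells where the choice of pseudo-absence set changes the prediction.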

In the second main step of the modelling, the 40 random GAMs calibrated using
the North American data were fitted to the European data sets. Transferring of the 
models was done between the corresponding pairs of climate data sets, i.e. the 10 ran- 
dom GAMs based on the 1901-1990 climate data set and the three climate variables 
(MTCO, GDD5, WB) from North America were projected to the European climate 
data set averaged over 1901-1980 and including the same three variables, and so forth. 
The probability values for the occurrence of Elodea canadensis derived from the trans- 
ferred models were compared with the distribution records extracted from Hultén and 
Fries (1986). As some parts of Europe were probably undersampled, we made the as- 
sumption that absence data was not available here (cf. Thuiller et al. 2005). 

The accuracies of the transferred models were tested separately for each of the 
four main model types in Europe. Using a chi-square test, we compared the number 
of known presences of Elodea situated in areas predicted to be climatically suitable 
for the species by <80% (<8 models out of 10; the four main GAM types were tested 
separately) of the models versus the number of known presences in areas predicted 
suitable by >80% of the models (cf. Herborg et al. 2007). Prior to the chi-square tests, 
the probabilities generated by the GAMs were transformed into presence and absence 
values using a cut-off level defined by the prevalence of species in the model calibration 
data (here 0.50) (see McPherson et al. 2004, Liu et al. 2005). In addition to chi-square 
tests, the mean probability values in the 1881 grid cells with known occurrences of 
Elodea canadensis were calculated separately for the four main types of GAMs and 
their differences were compared. Finally, the geographical patterns and variability of 
the probability values in the full European data set with 5083 grid cells were visually 
investigated to reveal potential differences between the projections from the four main 
types of GAMs. 
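
The counting step that precedes the chi-square test can be sketched as below: probabilities are converted to presence or absence with the 0.50 cut-off, and each known presence cell is assigned to the high-agreement or low-agreement group according to how many of the ten models predict it as suitable. The function name and data layout are hypothetical. Treating a probability strictly above the cut-off as suitable, and placing cells predicted by exactly 8 of 10 models in the high-agreement group, are both assumptions, because the text leaves these boundary cases open. The chi-square test itself is not reproduced because the expected frequencies are not described here.

```python
def consensus_counts(prob_by_model, presence_cells, cutoff=0.5, min_models=8):
    """Count known presences in high- and low-agreement areas.

    prob_by_model: list of per-model probability lists (same cell order).
    presence_cells: indices of cells with known occurrences.
    Returns (n_high, n_low).
    """
    n_high = n_low = 0
    for cell in presence_cells:
        # Number of models predicting the cell as climatically suitable.
        votes = sum(1 for model in prob_by_model if model[cell] > cutoff)
        if votes >= min_models:  # assumption: exactly 8 of 10 counts as high
            n_high += 1
        else:
            n_low += 1
    return n_high, n_low


# Hypothetical output of ten models for three presence cells:
# cell 0 is suitable in 10 models, cell 1 in 7 models, cell 2 in 8 models.
models = [[0.9, 0.9 if m < 7 else 0.1, 0.6 if m < 8 else 0.2] for m in range(10)]
high, low = consensus_counts(models, [0, 1, 2])
```

The two counts are the observed frequencies that would enter the test for one main model type.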

In the third main step of modelling, the 40 GAMs calibrated using North Ameri- 
can data were fitted to the Finnish climate sets, both for the time periods of 1961-1984 
and 1985-2006 and using the two types of climate predictor sets. The performance of 
the transferred GAMs with the Finnish data was evaluated as described for the whole 
European data sets. 


Results 


Models for North America 

The amount of explained deviance (D²) in the 40 random GAMs varied between
0.441 and 0.531, being on average highest in the models based on medium-term climate
data and the extended set of 7 climate variables (Table 1). With regard to AUC from the
resubstitution validation and AUC from the cross-validation, all the 40 GAMs showed
an excellent model performance (Table 1). On average, medium-term extended models
showed the highest AUC values but the difference from the long-term extended models
was marginal. In fact, there were no statistically significant differences between these two
main model types according to any of the four model performance criteria. The
medium-term and long-term baseline models differed significantly only with regard to their
BIC values. However, the medium-term extended models performed significantly better
on the basis of all four criteria than the medium-term baseline models, and similarly,
the long-term extended models outperformed the long-term baseline models (Table 1).

Both in the medium-term baseline and long-term baseline GAMs, the three cli- 
mate variables (MTCO, GDD5, WB) were selected in all models, and in each model 
GDD5 showed the highest model contribution. However, the medium-term and long- 
term extended GAMs were more variable in terms of the selected climate variables 
(Table 2). GDD5 and Defi_ann were selected in all the 20 extended models, followed 
closely by WB and Defi_sum (18 extended models). In the extended models, GDD5 
and Defi_sum appeared as the two most significant predictors of the distribution of 
Elodea canadensis, showing the highest model contributions (Table 2). 

Visual examination of the mean probabilities provided by the four main types of 
GAMs showed that the projections from the two baseline models differed very little, 
and those from the two extended models were also very similar to each other (Fig. 2). 
Extended models differed slightly from baseline models, for example, in that gener- 
ally they did not predict suitable areas for the species in Mexico, whereas the baseline 
models did (Fig. 2). Despite their high performance, all four main model types 
failed to predict part of the northernmost known occurrences of the species correctly. 
However, they agreed in suggesting that Elodea canadensis has not yet spread into all 
the climatically suitable areas (with mean probability value > 0.50) in North America. 

Variability (standard deviation) in the per-grid-cell probability values between the 
random GAMs indicated that the projections from the extended models vary more than 
projections from the baseline models. The standard deviation of the probability values in 
the medium-term baseline models ranged from 0.005 to 0.048 (Fig. 3a), in the medium- 
term extended models from 0.015 to 0.251 (Fig. 3b), and in the long-term baseline and 
long-term extended models from 0.006 to 0.048 and from 0.011 to 0.181, respectively. 


Table 2. Contributions of climate variables in the ‘extended’ models for Elodea canadensis in North 
America. (a) The number of times each climate predictor variable was selected in the 20 extended random 
GAMs based on medium-term (1961-1990) or long-term (1901-1990) time slices and an ‘extended’ set 
of seven climate variables, and (b) the number of times a given variable showed the highest model con- 
tribution in the models. The models were built separately 10 times both for medium-term climate data 
and long-term climate data. 

Predictors          (a) selected in the models    (b) highest model contribution 
MTCO                [illegible]                   0 
GDD5                20                            8 
WB                  18                            0 
Temp [illegible]    [illegible]                   0 
Temp [illegible]    [illegible]                   0 
Defi_sum            18                            1 
Defi_ann            20                            0 

Figure 2. Projected distribution of Elodea canadensis in North America based on four different modelling 
approaches. The maps show the mean probability of occurrence derived from 10 random GAMs based 
on (a) medium-term climate data including 3 baseline variables, (b) long-term climate data including 3 
baseline variables, (c) medium-term climate data including an extended set of 7 variables, and (d) long- 
term climate data including an extended set of 7 variables. Known occurrence points for Elodea canadensis 
are shown in (a) with green dots. The maps are in the same scale. Probabilities > 0.5 indicate areas where 
the species is projected to occur. 


Consequently, the projections between the individual random-set GAMs based on ex- 
tended climate data differed to some extent, particularly in the geographically marginal 
areas, for example between random-set 2 (Fig. 3c) and random-set 5 (Fig. 3d). 
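
The per-grid-cell mean and standard deviation across the random GAMs can be sketched as follows. This is only an illustration under stated assumptions: the probability values are invented, five models stand in for the ten used here, and the sample standard deviation (n - 1 denominator) is assumed because the text does not specify which form was used.

```python
from statistics import mean, stdev

# Hypothetical probabilities of occurrence: one row per random GAM,
# one column per grid cell (three cells, five models for brevity).
probs_by_model = [
    [0.90, 0.20, 0.10],
    [0.90, 0.80, 0.12],
    [0.90, 0.40, 0.11],
    [0.90, 0.90, 0.09],
    [0.90, 0.60, 0.10],
]

def per_cell_summary(probs_by_model):
    """Return per-grid-cell mean and sample standard deviation across models."""
    cells = list(zip(*probs_by_model))  # transpose: one tuple per grid cell
    means = [mean(cell) for cell in cells]
    sds = [stdev(cell) for cell in cells]
    return means, sds

means, sds = per_cell_summary(probs_by_model)
```

Mapping `sds` highlights the cells where individual models disagree (the second cell in this toy example), while `means` gives the consensus probability shown in the projected distribution maps.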


Models for Europe 

All the four main types of models developed for Elodea canadensis in North America 
predicted the known distribution in Europe very well. In the best case, the medium- 
term baseline models, only 2.7% of the known occurrences (51 out of 1881 occupied 

Figure 3. Variation in probability values and probabilities of occurrence from two individual models in 
North America. The per-grid-cell variation shows the standard deviation of probability values for the oc- 
currence of Elodea canadensis derived from the 10 random GAMs based on (a) medium-term baseline cli- 
mate data, (b) medium-term extended climate data. Probabilities of occurrence based on two individual 
medium-term extended models were derived from (c) a model based on pseudo-absence set 2, and (d) a 
model based on pseudo-absence set 5. Known occurrence points are shown in (a) and (b) with green dots. 


grid cells; Table 3) were in areas predicted to be climatically suitable by < 80% of the 
random-set GAMs, whereas 97.3% of the occurrences of the species were in grid cells 
predicted as suitable by ≥ 80% of the random GAMs (df = 1, chi-square = 1682.5, 
p<0.001). However, the differences between model types were marginal, as the other 
three main model types also showed a significantly high predictive ability (Table 3). 
The mean probabilities in the European grid cells with known occurrences were high- 
est in the long-term extended GAMs and lowest in the medium-term baseline GAMs 
(Table 3). The two statistically significant differences were that medium-term baseline 
GAMs had significantly lower mean probabilities in the grid cells with occurrences 
than the medium-term extended GAMs (paired t-test, df = 1880, t = -9.29, p<0.001) 
and the long-term baseline GAMs (df = 1880, t = -30.80, p<0.001). This discrepancy 
between the results of chi-square tests and mean probabilities in the occupied grid cells 
was caused by the higher variability in the performance of (separate) extended GAMs 
(Table 3). For example, the number of grid cells with known occurrences in Europe 
but predicted not to have the species varied in the separate medium-term extended 
GAMs from 46 to 85, but in the medium-term baseline GAMs from 49 to 53. 
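
The reported chi-square statistic is consistent with a one-degree-of-freedom test of the two occurrence counts against an equal split. The sketch below reproduces the value for the medium-term baseline models under that assumption; the function name is hypothetical, and the equal-split null hypothesis is inferred from the reported numbers rather than stated in the methods.

```python
import math

def chi_square_equal_split(n_below, n_above):
    """Chi-square test (df = 1) of two counts against equal expected counts."""
    total = n_below + n_above
    expected = total / 2.0
    chi2 = ((n_below - expected) ** 2 + (n_above - expected) ** 2) / expected
    # Survival function of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# 51 occupied cells predicted by < 80% of the GAMs, 1830 by >= 80%.
chi2, p = chi_square_equal_split(51, 1830)
print(round(chi2, 2))  # 1682.53
```

The same function applied to the counts 52 and 1829 gives 1678.75, matching the value reported for the long-term extended models.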

Table 3. Performance of the four main types of models for Elodea canadensis in Europe. The number of 
the grid cells with known occurrences of the species (a) predicted to have the species by < 80% of the 
random GAMs, (b) predicted to have the species by ≥ 80% of the random GAMs, (c) range (min – max) 
in (a) among the 10 individual random GAMs, (d) statistics from the chi-square test for (a) vs. (b), and 
(e) mean probability of occurrences averaged across the 1881 grid cells with known occurrences. 

                               (a)           (b)           (c)     (d)          (e) 
                               No of         No of                              Mean 
Main model type                cells < 80%   cells ≥ 80%   Range   χ²           probability 
Medium-term baseline models    51            1830          49-53   1682.53***   0.815 
Long-term baseline models      54            1827          47-53   1671.20***   0.823 
Medium-term extended models    73            1808          46-85   1600.33***   0.823 
Long-term extended models      52            1829          42-65   1678.75***   0.824 


Mean occurrence probabilities for Elodea canadensis in Europe from the 10 me- 
dium-term baseline GAMs and 10 long-term baseline GAMs showed a high spatial 
agreement (Fig. 4a—b). By contrast, long-term extended GAMs provided on average 
higher probability values than the medium-term extended GAMs (Fig. 4c—d). All the 
four main types of models identified the favourable northern range margin for the spe- 
cies very well (using the probability of 0.50 as a cut-off level for distinguishing which 
grid cells are climatically suitable for the species and which are not), and at maximum 
only 11 records occurred in areas predicted to be climatically unsuitable by the four 
model types. The models also agreed in predicting that the climatically suitable areas for 
the species extend into much wider areas in the Mediterranean countries and areas next 
to the Black Sea than indicated by the range map of Hultén and Fries (1986) (Fig. 4). 
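
The two thresholds used in this evaluation, the 0.50 probability cut-off and the 80% agreement among random GAMs, can be combined as in the following minimal sketch. The function names and the probability values are hypothetical and serve only to show how occupied grid cells would be split into the two groups compared in Table 3.

```python
def agreement(cell_probs, cutoff=0.5):
    """Fraction of models predicting a cell as suitable (probability > cutoff)."""
    votes = sum(1 for p in cell_probs if p > cutoff)
    return votes / len(cell_probs)

def split_by_consensus(occupied_cells, min_agreement=0.8):
    """Count occupied cells predicted suitable by < and >= min_agreement of models."""
    below, at_or_above = 0, 0
    for cell_probs in occupied_cells:
        if agreement(cell_probs) >= min_agreement:
            at_or_above += 1
        else:
            below += 1
    return below, at_or_above

# Hypothetical probabilities from 10 random GAMs for three occupied grid cells.
occupied_cells = [
    [0.9] * 10,              # all 10 models predict presence
    [0.6] * 8 + [0.4] * 2,   # 8 of 10 models predict presence
    [0.7] * 5 + [0.3] * 5,   # 5 of 10 models predict presence
]
counts = split_by_consensus(occupied_cells)
```

In this toy case `counts` is `(1, 2)`: one occupied cell falls below the 80% agreement level and two reach it.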

The probabilities generated for all the 5083 grid cells in Europe showed much 
more variation among the individual extended models than the baseline models. For 
example, the standard deviation of the per-grid-cell probabilities in the medium-term 
extended models ranged from 0.005 to 0.351, whereas in the corresponding medium- 
term baseline GAMs the range was from 0.002 to 0.057 (Fig. 5a—b). The areas where 
the probabilities varied maximally were geographically and climatically marginal areas 
in the European study window. Consequently, the projections from individual GAMs 
based on extended climate data occasionally differed considerably in these areas (Fig. 
5c–d). 


Models for Finland 

All the four main types of models showed a high predictive ability (chi-square = 264.13 
— 272.14, p<0.001: Table 4) for the climatically suitable areas for Elodea canadensis 
in Finland on the basis of the climate and species data from 1961-1984. In the two 
extended models, 99.6% (275 out of 276) of the grid cells with occurrences of Elodea 
canadensis were predicted to have the species by ≥ 80% of the random GAMs, and the 
two baseline models performed almost equally well (Table 4). The mean probabilities 
derived from the two baseline types of models were very similar to each other (Fig. 6a–b), 
but the projections derived from the extended models differed slightly in some areas 

Figure 4. Projected distribution of Elodea canadensis in Europe based on four different modelling ap- 
proaches. The maps show the mean probability of occurrence derived from 10 random GAMs calibrated 
with species and climate data from North America and transferred into Europe: (a) medium-term baseline 
models, (b) long-term baseline models, (c) medium-term extended models, and (d) long-term extended 
models. Known occurrence points are shown in (a) with green dots. The maps are in the same scale. 


(Fig. 6c–d). The area predicted as climatically suitable by the extended models reached 
ca. 100-200 km further north than in the baseline models. However, there was more 
variation in the probability values derived from extended models than in those from the 
baseline models. Moreover, projections from the individual extended GAMs occasionally 
differed considerably (Fig. 6e–h), as did the number of grid cells with known 
occurrences but predicted not to have the species by the individual models (Table 4). 
Transferring the models to the climate and species data from 1985-2006 indicated 
that the four types of models also have a high predictive ability for the most recent 
distribution of Elodea canadensis in Finland (chi-square = 347.52 – 351.38, p<0.001: Table 
5). The majority of the new records for Elodea canadensis in Finland discovered in 1985-2006 
were located in the areas projected as climatically suitable, but 6 or 7 new records 
occurred in the 10-km grid cells situated northwards from the area predicted as climatically 

Figure 5. Variation in probability values and probabilities of occurrence from two individual models in 
Europe. The per-grid-cell variation shows the standard deviation of probability values for the occurrence 
of Elodea canadensis derived from the 10 random GAMs based on (a) medium-term baseline climate data, 
(b) medium-term extended climate data. Probabilities of occurrence based on two individual medium- 
term extended models were derived from (c) a model based on pseudo-absence set 2, and (d) a model 
based on pseudo-absence set 5. Known occurrences are shown in (a) and (b) with green dots. 


Table 4. Performance of the four main types of models in Finland with data from 1961-1984. The 
number of the grid cells with known occurrences of Elodea canadensis (a) predicted to have the species by 
< 80% of the random GAMs, (b) predicted to have the species by ≥ 80% of the random GAMs, (c) range 
(min – max) in (a) among the 10 individual random GAMs, (d) statistics from the chi-square test for (a) 
vs. (b), and (e) mean probability of occurrences averaged across the 276 grid cells with known occurrences. 

                               (a)           (b)           (c)     (d)         (e) 
                               No of         No of                             Mean 
Main model type                cells < 80%   cells ≥ 80%   Range   χ²          probability 
Medium-term baseline models                                        264.13***   0.645 
Long-term baseline models                                          268.05***   0.669 
Medium-term extended models    1             275                   272.01***   0.698 
Long-term extended models      1             275                   272.01***   0.669 


suitable (Fig. 7). As with the 1961-1984 data, the area predicted as climatically 
suitable extended further north in the extended models than in the baseline models, the 
extended models showed higher variation in their probability values, and projections from 
the individual extended models occasionally differed considerably (Fig. 7, Table 5). 

Figure 6. Projected distributions, variation in probability values and probabilities of occurrence from two 
individual models in Finland. Projected distributions show mean probability of occurrence for Elodea ca- 
nadensis in Finland, based on 10 random GAMs fitted to climate data from 1961-1984: (a) medium-term 
baseline models, (b) long-term baseline models, (c) medium-term extended models, and (d) long-term 
extended models. The standard deviation of per-grid-cell probabilities derived from 10 random GAMs is 
shown for (e) the medium-term baseline models, and (f) the medium-term extended models. Probabilities 
of occurrence based on two individual medium-term extended models were derived from (g) a model based 
on pseudo-absence set 2, and (h) a model based on pseudo-absence set 5. Known occurrence points from 
< 1985 are shown with green dots. The maps are in the same scale in (a) – (d), in (e) and (f), and in (g) and (h). 


Discussion 


According to Rahel and Olden (2008) there are few examples of geographic range shifts 
consistent with recent changes in climate in freshwater organisms, in contrast to several 
examples in terrestrial and marine species. Sporadic data include observations of the recent 
invasion of Ranunculus trichophyllus in high-elevation lakes in the Himalayas, considered 
as a signal of a warming climate (Lacoul and Freedman 2006), poleward range 
shifts in four freshwater taxa in the UK during the recent period of climate warming (Hickling 
et al. 2006), and some solitary observations of new occurrences at high latitudes (for 

Table 5. Performance of the four main types of models in Finland with data from 1985-2006. The 
number of the grid cells with known occurrences of Elodea canadensis (a) predicted to have the species by 
< 80% of the random GAMs, (b) predicted to have the species by ≥ 80% of the random GAMs, (c) range 
(min – max) in (a) among the 10 individual random GAMs, (d) statistics from the chi-square test for (a) 
vs. (b), and (e) mean probability of occurrences averaged across the 375 grid cells with known occurrences. 

                               (a)           (b)           (c)     (d)         (e) 
                               No of         No of                             Mean 
Main model type                cells < 80%   cells ≥ 80%   Range   χ²          probability 
Medium-term baseline models    7             368           6-8     347.52***   0.691 
Long-term baseline models      6             369           6-6     351.38***   0.711 
Medium-term extended models    7             368           2-7     347.52***   0.749 
Long-term extended models      6             369           6-40    351.38***   0.703 


a review see Heino et al. 2009). This study contributes to this accumulating evidence and 
shows that Elodea canadensis, an introduced freshwater plant species, has recently spread 
northwards in northernmost Europe, in Finland, in concert with the recent climatic 
changes and in agreement with the predictions from bioclimatic envelope models. 

A number of earlier studies have reported the ability of ecological niche models and 
bioclimatic envelope models to predict the geographic occurrences of invasive fresh- 
water species in their native and introduced range, mainly for fish species (Iguchi et al. 
2004, Chen et al. 2007) and more rarely for aquatic plants (Peterson 2003a, Peterson 
et al. 2003). However, some recent studies have reported notable mismatches between 
the model projections developed for the invaded areas and the observed occurrences 
therein (Broennimann et al. 2007, Fitzpatrick et al. 2007). Such discrepancies may 
reflect the potential of invasive species to occupy climatically distinct niche spaces 
in the invaded areas (Broennimann et al. 2007), a phenomenon which would decrease 
the usefulness of niche-based models to assess the potential spread of introduced 
species. However, such mismatches did not occur in our results. Thus bioclimatic envelope 
models appear to have the potential to produce useful first-filter predictions 
for the distribution of Elodea canadensis, and to identify the broad-scale geographical 
limits to the species’ spread and areas most vulnerable to invasions (cf. Peterson 2003a, 
Peterson et al. 2003, Herborg et al. 2007). 

All the four main types of models applied in this study provided accurate and 
statistically highly significant predictions of the occurrences of Elodea canadensis both 
in the native and invaded range. In Europe, the most notable mismatches between the 
model projections and the distribution map of Elodea canadensis by Hultén and Fries 
(1986) occurred in the Alps and in southernmost areas in Europe and adjacent areas 
around the Black Sea. All the four main model types predicted that there are no cli- 
matically suitable areas for the species in the Alps, whereas Hultén and Fries (1986) re- 
ported that the species occurs throughout this area. This discrepancy may be based on 
possible errors in the expert-drawn delineation of historical range of Elodea canadensis 
in areas with few known occurrence points (Habib et al. 2003, Graham et al. 2008), 
in other words, exaggerating the extent of occurrences in the Alps. Alternatively, the 

Figure 7. Projected distributions, variation in probability values and probabilities of occurrence from 
two individual models in Finland. Projected distributions show mean probability of occurrence for Elodea 
canadensis in Finland, based on 10 random GAMs fitted to climate data from 1985-2006: (a) medium- 
term baseline models, (b) long-term baseline models, (c) medium-term extended models, and (d) long- 
term extended models. The standard deviation of per-grid-cell probabilities derived from 10 random 
GAMs is shown for (e) the medium-term baseline models, and (f) the medium-term extended models. 
Probabilities of occurrence based on two individual medium-term extended models were derived from (g) 
a model based on pseudo-absence set 2, and (h) a model based on pseudo-absence set 5. Known occurrence 
points from < 1985 are shown with green dots, and from 1985-2006 with blue dots. 


species may have been recorded to occur in the Alps in lakes situated in microclimati- 
cally sheltered valleys and at the base of the mountains, at altitudes over 750 meters 
a.s.l. (Unni 1977, Dubois et al. 1988). Bioclimatic envelope models generally use the 
mean values of climate variables averaged over the whole grid cell, and thus in topographically 
heterogeneous landscapes they may fail to detect the existence of sheltered, 
climatically suitable sites for the appearance of lowland species (cf. Peterson 2003b, 
Luoto and Heikkinen 2008). 

Our models suggested that the climatically suitable area for Elodea canadensis cov- 
ers much larger areas in southern Europe than those where the species was mapped by 
Hultén and Fries (1986). This disagreement may be a result of the poor ability of our 
models to detect the southern range limit of the species, leading to consequent over- 
predictions in the model projections. However, more probably the species had not yet 
spread to these climatically suitable areas before the map of Hultén and Fries was pub- 
lished, as indicated by the recent observations of the species from Turkey (Akbulut et 
al. 2001) and northern Africa (Vila et al. 1999). Thus although Elodea canadensis was 
introduced in Europe in the 1830s and has spread effectively since then, it apparently 
still had not reached all climatically suitable areas in Europe and adjacent areas by the 
1980s. This suggests that the delimitation of the full climatic limits for the species can 
be subject to biases if made only on the basis of the invaded range. In a similar vein, 
Welk (2004) argued that a reliable prediction of the invaded range of Lythrum salicaria 
in North America was only possible using a large cumulative data set compiled during 
ca. 150 years of monitoring of the species range changes in North America. This calls 
for special caution in ecological niche modelling of recently introduced species based 
on invaded range only. 

Interestingly, in Finland about half a dozen new observations of Elodea canadensis 
discovered in 1985-2006 occurred up to a maximum of 300 km north from the areas 
predicted as climatically suitable. These new records suggest that either the species has 
been able to accelerate its spread towards northernmost Europe during recent years 
more than simulated by the bioclimatic models, or that a series of unusually warm 
years during the last ca. 15 years in Finland (Tuomenvirta 2004, Pöyry et al. 2009) and 
elsewhere in Europe (Della-Marta et al. 2007) has enabled Elodea canadensis to make 
major dispersal jumps. Indeed, the northernmost records have been made quite re- 
cently, one in 1994 and the other in 2001, and thus these observations probably reflect 
the accumulating effect of the recent warm years. 

Three factors potentially affecting the performance of the bioclimatic envelope mod- 
els were examined here: temporal delimitation of the climate data, selection of climate 
variables, and interactions between multiple sets of pseudo-absences and increasing the 
number of predictor variables. Several studies have used climate data averaged over a 30 
year period (e.g. Huntley et al. 1995, Sykes et al. 1996, Hartley et al. 2006), although the 
species data might have been collected over a much longer time slice. We used climate 
predictor variables that were averaged both over a 30 year period and over an 80- (Europe) 
or 90- (North America) year period. However, the use of longer-term climate data (which 
better covered the time slice when the species occurrence records were made) did not 
significantly improve the model accuracy or affect the geographical predictions of suit- 
able areas. Thus climate data averaged over a 30 year time slice also appear to offer in our 
case useful predictors for species data collected over a longer time slice. However, it is not 
possible to assess the generality of this finding, i.e. whether it is a special case or could be 
applied to other corresponding bioclimatic modelling studies as well. Similar results are 
likely to emerge in studies where the climate data averaged over a shorter time period do 
not deviate much from the data averaged over a longer time period. If the two climate data 
sets deviate notably, differences in the model outputs may occur especially when the mod- 
els are fitted to the climate scenario data to assess potential future species distributions. 

Additional factors which might critically affect the model performance are the 
time slice and the climate conditions under which the species data in the invaded range 
have been collected. In particular, if several occurrence records have been made in climatically 
extreme years that have enabled notable dispersal jumps for the species, models based 
on climate data sets averaged over several decades can fail to predict correctly such 
abrupt changes in distribution range (Baker et al. 2000, Heikkinen et al. 2006b). Thus 
in projecting the models to the invaded range, it should be noted that changes in the 
range margins may be related to climatically highly optimal short-term time periods 
and associated dispersal jumps of the species (cf. Mitikka et al. 2008), as also suggested 
by our results. 

Insufficient attention has been paid to the potential impacts of selection of cli- 
matic variables (Beaumont et al. 2005, Heikkinen et al. 2006b, Beaumont et al. 2007), 
although the observed discrepancies between model predictions and invaded ranges 
may be caused by the choice of climatic predictors or climate data sets used in the 
modelling (Peterson and Nakazawa 2008). Here, the extended models including seven 
climate variables showed significantly better model performance in North America 
(and in Finland) than the baseline models with three climate variables. The differences 
in spatial projections of the climatically suitable areas between the baseline and ex- 
tended models were slight in North America, but showed a tendency to become more 
noticeable when the models were transferred to Europe, particularly to Finland (cf. 
Thuiller 2003). In general, the success of the attempts to transfer species distribution 
models from one continent or region to another has varied between the studies. Some 
studies have reported a high success for the projections of the transferred models (e.g. 
Peterson 2003a, Iguchi et al. 2004, Chen et al. 2007), while some other studies have 
shown notable differences between the model predictions and observed species distri- 
butions (e.g. Fitzpatrick et al. 2007, Broennimann et al. 2007, Beaumont et al. 2009). 
Recent studies suggest that habitat models based on essential functional resources for 
the studied species could be transferred better in space than models that use indirect 
environmental variables, such as biotope types (Vanreusel et al. 2007), and that the 
transferability may be reduced due to the peculiarities of the study areas, such as differ- 
ences in the ranges of environmental factors and the varied impact of land-use history 
between the model calibration and model evaluation areas (Randin et al. 2006). 

With regard to individual climate variables, GDD5 was among the two most im- 
portant predictors of the distribution of Elodea canadensis in all models. GDD5 has 
been successfully employed in broad-scale modelling of the distribution of terrestrial 
plant species (Beerling et al. 1995, Huntley et al. 1995, Sykes et al. 1996), and it 
also appears to provide a useful predictor of the range limits and climatically suitable 
areas for invasive aquatic plants as well. In North America, Welk (2004) concluded 
that Lythrum salicaria, an invasive wetland species, is sensitive especially to variation 
in length of the growing season. Growing degree days, which is an indicator for the 
length and thermal intensity of the growing season, can thus be a highly useful predic- 
tor in delimiting the northern range boundaries for a wide range of plant species from 
different habitats. By contrast, mean temperature of the coldest month (MTCO) was 
in most cases replaced in the extended models by one or more of the four additional 
variables. The lower explanatory power of MTCO for aquatic plants in comparison 
to terrestrial species is probably related to the fact that water bodies tend to mitigate 
the effects of extreme minimum temperatures, and moreover, ice cover isolates the 
overwintering submerged aquatic plants from the effects of extreme cold periods. Annual 
water balance (WB) was selected into the extended models almost as often as the two 
water deficit variables, but made a lower model contribution than they did. 
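
Growing degree days can be illustrated with a minimal sketch, assuming that GDD5 denotes the accumulated daily mean temperature above a 5 °C base. The function name and the temperature values are hypothetical, and the way the climate data used here were converted to daily values is not reproduced.

```python
def growing_degree_days(daily_mean_temps_c, base_c=5.0):
    """Sum of daily mean temperature excess above the base temperature."""
    return sum(max(0.0, t - base_c) for t in daily_mean_temps_c)

# Hypothetical daily mean temperatures (degrees C) for five days.
temps = [2.0, 5.0, 8.0, 12.5, 4.0]
gdd5 = growing_degree_days(temps)  # 0 + 0 + 3.0 + 7.5 + 0 = 10.5
```

Because days at or below the base contribute nothing, the index reflects both the length of the growing season and its thermal intensity, which a single monthly mean such as MTCO or July temperature cannot do.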

Contrary to our expectations, July temperature did not appear as a significant 
predictor for Elodea canadensis. It is possible that the two water deficit variables that 
combine the impacts of temperature and precipitation better reflect the areas where 
water temperatures in inland watercourses are exposed to critical levels of warming, 
whereas the northern range limit for the species is more accurately determined by 
GDD5 than by July temperature. 

Overall, the improvements in model performance observed here between the base- 
line models and the extended models support the conclusions reached by Beaumont et 
al. (2005): more consideration should be paid to the selection of variables in order to 
identify those that have the greatest predictive power, and knowledge on the biology of 
the modelled species should be used as much as possible during the variable selection. 
In particular, the use of one baseline set of environmental variables which is readily at 
hand in multispecies modelling studies (instead of careful species-specific selection of 
variables) may result in suboptimal models being generated for some, or even many, of 
the species (Heikkinen et al. 2006b). 

However, inclusion of more predictor variables in ecological niche models can also 
cause drawbacks. Most importantly, excessive inclusion of predictors may complicate 
the interpretation of the importance and effect of individual predictor variables (Heik- 
kinen et al. 2004, Hartley et al. 2006), and result in over-fitting and overly complex 
models (Hartley et al. 2006). In addition, our results show that increasing the number 
of candidate predictors increases the uncertainty in model predictions in a hitherto 
rarely acknowledged way, i.e. via its interaction with the use of pseudo-absences. There 
are several approaches for generating pseudo-absence points (Pearce and Boyce 2006), 
including selecting points randomly (McPherson et al. 2004), randomly with case- 
weighting to reduce the effective sample size of pseudo-absences (Guisan et al. 2007), 
or via environmentally weighted random sampling (Zaniewski et al. 2002). This se- 
lection of approach can affect the outcomes of the models (Engler et al. 2004), but 
the pros and cons of different approaches remain open to debate (Chefaoui and Lobo 
2008). Here, following McPherson et al. (2004), the pseudo-absences were selected 
randomly. Our results show that increasing the number of predictors may notably in- 
crease the variability of the model projections based on different random sets of pseu- 
do-absences, due to the varying combinations of climate variables that were selected in 
the different extended GAMs. This variability between the individual extended models 
caused increased variation in the per-grid-cell probability values and in the spatial 
predictions of the suitable areas between the individual models (Figs. 3, 5, 6 and 7). 
This suggests that, in order to lower the risk of choosing an inappropriate set of pseudo-absence 
points and generating suboptimal models, multiple sets of pseudo-absences 
should be generated instead of using only one selection (Engler et al. 2004), and 
averages should be calculated across multiple models to provide consensus predictions (Hartley 
et al. 2006). Projections from multiple models allow quantification of the uncertainty 
in model predictions, which in turn assists the making of management decisions with 
greater certainty (Hartley et al. 2006). In our case, mapping the standard deviation of 
the per-grid-cell probability values provided a simple way to visualise where the predic- 
tions from the individual extended GAMs differed most and where they agreed most. 
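
Drawing several independent random sets of pseudo-absences can be sketched as follows. This is an illustration only: the function name, the integer grid-cell identifiers and the number of pseudo-absences per set are hypothetical, and defaulting the set size to the number of presences is an assumption rather than the design used in this study.

```python
import random

def pseudo_absence_sets(all_cells, presence_cells, n_sets=10, n_per_set=None, seed=0):
    """Draw n_sets random pseudo-absence samples from cells without known presences."""
    presence = set(presence_cells)
    candidates = [cell for cell in all_cells if cell not in presence]
    if n_per_set is None:
        # Assumption: as many pseudo-absences as presences, capped by availability.
        n_per_set = min(len(presence), len(candidates))
    rng = random.Random(seed)  # fixed seed keeps the draws reproducible
    return [rng.sample(candidates, n_per_set) for _ in range(n_sets)]

# Hypothetical study window of 100 grid cells with four known presences.
all_cells = list(range(100))
presence_cells = [3, 10, 42, 77]
sets = pseudo_absence_sets(all_cells, presence_cells, n_sets=10, n_per_set=20, seed=1)
```

One model would then be fitted per set, and the spread of the resulting predictions summarised per grid cell, for instance with the standard deviation of the probability values.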

It is obvious that identification of the detailed locations most at risk of invasion 
by Elodea canadensis would benefit from including information on other factors 
in addition to climate (cf. Rahel and Olden 2008). Potentially useful additional predic- 
tors include factors describing the degree of human influence (population density, land 
transformation, presence of infrastructures etc.) (Ficetola et al. 2007), and physical and 
chemical characteristics of water bodies (cf. Buchan and Padilla 2000). In the case of 
Elodea canadensis particularly water chemistry might matter. Although the species has a 
relatively wide tolerance for water pH, it favours calcium- and nutrient-rich eutrophic 
waters (Weidema 2000). 

It is very likely that Elodea canadensis will continue to spread further north in Finland 
and elsewhere in northernmost Europe. This is because the magnitude of the projected 
climate change is particularly high in northern latitudes (ACIA 2005). Warming 
climate can reduce the extent of ice cover and cause warmer water temperatures in 
high latitude water bodies, and thus allow the further expansion of invasive aquatic 
species such as Elodea (Rahel and Olden 2008). In general, freshwater organisms are 
less capable of tracking the geographic shifts in climatic optima than terrestrial
organisms (Rahel and Olden 2008). However, Elodea canadensis is probably better equipped
than aquatic species in general to pass the four main barriers in the process of species
invasion into new areas (see Hellmann et al. 2008): (1) the Geography barrier (many
dispersal vectors, long-lasting fragments), (2) the Abiotic conditions barrier (survives in
invaded areas owing to its relatively wide tolerance), (3) the Biotic interactions barrier
(able to compete effectively with native macrophytes and become dominant), and (4) the
Landscape factors barrier (vegetative fragments spreading via watercourses and passive
transportation by human activities and waterfowl) (Goodwin et al. 1999, Barrat-Segretain
et al. 2002, Richardson et al. 2007).


Conclusions 

Our results suggest that bioclimatic envelope models can provide a useful first-step 
tool for the identification of areas most at risk of colonization by Elodea canadensis,
and possibly also by other similar aquatic invasive species. Such models may help
in targeting early preventive or ameliorative measures in a timely manner (Kriticos
et al. 2003) and in planning and prioritizing control measures (Weber 2001,
Roura-Pascual et al. 2004), and may inform us about the potential further spread of
the species across the new landscape (Chen et al. 2007). However, increasing attention should
be directed to careful consideration and selection of the environmental variables
included in the models, generating consensus predictions based on multiple models
(especially when employing pseudo-absences), and investigating and quantifying the 
geographic patterns of the uncertainties in the model predictions. These actions 
would help in improving the usefulness of bioclimatic envelope models, and eco- 
logical niche models in general, in predicting the distributions and range shifts of 
invasive aquatic species. 